818 research outputs found

Incremental refinement of image salient-point detection

Author: Andreopoulos Y
Patras I
Publication date: 01/01/2008

Low-level image analysis systems typically detect "points of interest", i.e., areas of natural images that contain corners or edges. Most of the robust and computationally efficient detectors proposed for this task use the autocorrelation matrix of the localized image derivatives. Although the performance of such detectors and their suitability for particular applications has been studied in relevant literature, their behavior under limited input source (image) precision or limited computational or energy resources is largely unknown. All existing frameworks assume that the input image is readily available for processing and that sufficient computational and energy resources exist for the completion of the result. Nevertheless, recent advances in incremental image sensors or compressed sensing, as well as the demand for low-complexity scene analysis in sensor networks now challenge these assumptions. In this paper, we investigate an approach to compute salient points of images incrementally, i.e., the salient point detector can operate with a coarsely quantized input image representation and successively refine the result (the derived salient points) as the image precision is successively refined by the sensor. This has the advantage that the image sensing and the salient point detection can be terminated at any input image precision (e.g., bound set by the sensory equipment or by computation, or by the salient point accuracy required by the application) and the obtained salient points under this precision are readily available. We focus on the popular detector proposed by Harris and Stephens and demonstrate how such an approach can operate when the image samples are refined in a bitwise manner, i.e., the image bitplanes are received one-by-one from the image sensor. We estimate the required energy for image sensing as well as the computation required for the salient point detection based on stochastic source modeling. 
The computation and energy required by the proposed incremental refinement approach are compared against the conventional salient-point detector realization that operates directly on each source precision and cannot refine the result. Our experiments demonstrate the feasibility of incremental approaches for salient point detection in various classes of natural images. In addition, we present a first comparison between the results obtained by the intermediate detectors, as well as a novel application for adaptive low-energy image sensing based on points of saliency.
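
As a rough illustration of the setting this abstract describes, the sketch below computes the Harris and Stephens corner response on an image whose pixels are known only to their most significant bitplanes. It is not the paper's incremental algorithm: it recomputes the response from scratch at each precision, which corresponds to the conventional baseline the abstract compares against. The function names, the 3x3 box window, the central-difference derivatives, and k = 0.04 are all assumptions chosen for illustration.

```python
def keep_top_bitplanes(img, planes, depth=8):
    """Zero all but the `planes` most significant bits of each pixel."""
    shift = depth - planes
    return [[(p >> shift) << shift for p in row] for row in img]


def harris_response(img, k=0.04):
    """Harris-Stephens response R = det(M) - k * trace(M)^2, 3x3 box window."""
    n_rows, n_cols = len(img), len(img[0])
    # Central-difference derivatives (left at zero on the one-pixel border).
    ix = [[0.0] * n_cols for _ in range(n_rows)]
    iy = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            ix[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            iy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    resp = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(2, n_rows - 2):
        for c in range(2, n_cols - 2):
            # Accumulate the autocorrelation matrix entries over the window.
            sxx = syy = sxy = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    gx, gy = ix[r + dr][c + dc], iy[r + dr][c + dc]
                    sxx += gx * gx
                    syy += gy * gy
                    sxy += gx * gy
            resp[r][c] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Synthetic 9x9 image: a bright square whose corner sits at (4, 4).
image = [[200 if r >= 4 and c >= 4 else 0 for c in range(9)] for r in range(9)]

# Corner response when only the top 1, 2, or all 8 bitplanes are available.
corner_by_precision = {
    planes: harris_response(keep_top_bitplanes(image, planes))[4][4]
    for planes in (1, 2, 8)
}
```

On this toy image the corner already yields a positive response with a single bitplane, and the response grows as more bitplanes arrive. This is in the spirit of terminating detection at any input precision, although a single synthetic example says nothing about natural images.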

CiteSeerX

UCL Discovery

Fine-Tuning Regression Forests Votes for Object Alignment in the Wild

Author: Patras I
Yang H
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/03/2017

Crossref

Queen Mary Research Online

Prompting Visual-Language Models for Dynamic Facial Expression Recognition

Author: Patras I
Zhao Z
Publication date: 25/08/2023

This paper presents a novel visual-language model called DFER-CLIP, which is based on the CLIP model and designed for in-the-wild Dynamic Facial Expression Recognition (DFER). Specifically, the proposed DFER-CLIP consists of a visual part and a textual part. For the visual part, based on the CLIP image encoder, a temporal model consisting of several Transformer encoders is introduced for extracting temporal facial expression features, and the final feature embedding is obtained as a learnable "class" token. For the textual part, we use as inputs textual descriptions of the facial behaviour that is related to the classes (facial expressions) that we are interested in recognising – those descriptions are generated using large language models, like ChatGPT. In contrast to works that use only the class names, this more accurately captures the relationship between them. Alongside the textual description, we introduce a learnable token which helps the model learn relevant context information for each expression during training. Extensive experiments demonstrate the effectiveness of the proposed method and show that our DFER-CLIP also achieves state-of-the-art results compared with the current supervised DFER methods on the DFEW, FERV39k, and MAFW benchmarks. Code is publicly available at https://github.com/zengqunzhao/DFER-CLIP

Queen Mary Research Online

LEARNING VISUAL SALIENCY USING TOPOGRAPHIC INDEPENDENT COMPONENT ANALYSIS

Author: Patras I
Stefic D
Publication date: 06/06/2016

Queen Mary Research Online

UNSUPERVISED CONVOLUTIONAL NEURAL NETWORKS FOR MOTION ESTIMATION

Author: Ahmadi A
Patras I
Publication date: 14/01/2017

Queen Mary Research Online

Dynamics of facial expression: recognition of facial actions and their temporal segments from face profile image sequences

Author: I. Patras
M. Pantic
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Deep globally constrained MRFs for Human Pose Estimation

Author: Marras I
Palasek P
Patras I
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/11/2017

Queen Mary Research Online

Spatiotemporal saliency for human action recognition

Author: Oikonomopoulos A
Pantic M
Patras I
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2005

Crossref

Spiral - Imperial College Digital Repository

Linear Maximum Margin Classifier for Learning from Uncertain Data

Author: Mezaris V
Patras I
Tzelepis C
Publication date: 10/11/2017

In this paper, we propose a maximum margin classifier that deals with uncertainty in data input. More specifically, we reformulate the SVM framework such that each training example can be modeled by a multi-dimensional Gaussian distribution described by its mean vector and its covariance matrix -- the latter modeling the uncertainty. We address the classification problem and define a cost function that is the expected value of the classical SVM cost when data samples are drawn from the multi-dimensional Gaussian distributions that form the set of the training examples. Our formulation approximates the classical SVM formulation when the training examples are isotropic Gaussians with variance tending to zero. We arrive at a convex optimization problem, which we solve efficiently in the primal form using a stochastic gradient descent approach. The resulting classifier, which we name SVM with Gaussian Sample Uncertainty (SVM-GSU), is tested on synthetic data and five publicly available and popular datasets; namely, the MNIST, WDBC, DEAP, TV News Channel Commercial Detection, and TRECVID MED datasets. Experimental results verify the effectiveness of the proposed method. Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence. (c) 2017 IEEE. DOI: 10.1109/TPAMI.2017.2772235 Author's accepted version. The final publication is available at http://ieeexplore.ieee.org/document/8103808
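
The cost function described in this abstract, the expected hinge loss when an input is drawn from a Gaussian, has a closed form that follows from standard Gaussian integrals. The sketch below evaluates it for one training example. It is an illustration, not the authors' code: the function name and argument layout are hypothetical, and the regularization term and the stochastic gradient descent solver mentioned in the abstract are not shown.

```python
import math


def expected_hinge_loss(w, b, mean, cov, label):
    """E[max(0, 1 - y(w.x + b))] for x ~ N(mean, cov), in closed form."""
    dim = len(w)
    # The margin deficit m = 1 - y(w.x + b) is itself Gaussian, with mean d_mu
    # and variance w^T cov w.
    d_mu = 1.0 - label * (sum(w[i] * mean[i] for i in range(dim)) + b)
    var = sum(w[i] * cov[i][j] * w[j] for i in range(dim) for j in range(dim))
    if var <= 0.0:
        return max(0.0, d_mu)  # no uncertainty: classical hinge loss
    sd = math.sqrt(var)
    z = d_mu / sd
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))          # standard normal CDF
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)   # standard normal PDF
    return d_mu * cdf + sd * pdf


identity = [[1.0, 0.0], [0.0, 1.0]]
tiny = [[1e-6, 0.0], [0.0, 1e-6]]

# A well-classified point with near-zero variance incurs (almost) no loss,
# matching the classical hinge loss in the limit.
loss_certain = expected_hinge_loss([1.0, 0.0], 0.0, [2.0, 0.0], tiny, 1)

# The same point with unit variance incurs a positive expected loss.
loss_uncertain = expected_hinge_loss([1.0, 0.0], 0.0, [2.0, 0.0], identity, 1)
```

As the covariance shrinks toward zero the value approaches the ordinary hinge loss, which is consistent with the abstract's remark about isotropic Gaussians with vanishing variance.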

arXiv.org e-Print Archive

City Research Online

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Queen Mary Research Online

Motion history for facial action detection in video

Author: Pantic M
Patras I
Valstar M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2004

Crossref

Spiral - Imperial College Digital Repository